Case study for Time series inference on beer production

Branislav Doubek

Imports

Data loading

Data exploration and feature creation

In this subsection we create various features, starting with the seasons.
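A minimal sketch of how a season feature could be derived from a monthly date index. The frame, the column names, and the northern-hemisphere season mapping are all assumptions made for illustration; the notebook's own mapping may differ.

```python
import pandas as pd

# Hypothetical monthly frame; the column name "production" is an assumption.
df = pd.DataFrame(
    {"production": range(24)},
    index=pd.date_range("1990-01-01", periods=24, freq="MS"),
)

# Assumed northern-hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

# Month number first, then the season looked up from it.
df["month"] = df.index.month
df["season"] = df["month"].map(SEASON_BY_MONTH)
```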

We can see that our dataset does not contain any NaNs, which is great since we don't need to interpolate any values.
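For reference, a small sketch of such a check on a toy series (the values are made up, not taken from the dataset). It also shows what interpolation would look like if a gap had been present.

```python
import numpy as np
import pandas as pd

# Toy series with one artificial gap; the real dataset has none.
series = pd.Series(
    [1.0, 2.0, np.nan, 4.0],
    index=pd.date_range("1990-01-01", periods=4, freq="MS"),
)

# Count missing values.
n_missing = int(series.isna().sum())

# Linear interpolation fills the gap from its neighbours.
filled = series.interpolate(method="linear")
```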

First impression from the data

From the graph we can see that the time series is not stationary (the mean is not time invariant, i.e. there is a trend).
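One simple way to make the non-constant mean visible is a 12-month rolling mean. The sketch below uses a synthetic trending series as a stand-in for the real data; all numbers are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
index = pd.date_range("1960-01-01", periods=120, freq="MS")

# Synthetic series: a linear trend plus noise (not the beer data).
series = pd.Series(np.linspace(100, 200, 120) + rng.normal(0, 5, 120), index=index)

# A 12-month rolling mean smooths out within-year variation.
rolling_mean = series.rolling(window=12).mean()

# For a stationary series these two values would be close to each other.
first_year_mean = rolling_mean.iloc[11]
last_year_mean = rolling_mean.iloc[-1]
```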

Time series decomposition

Our data consists of a trend, seasonal effects, and noise. Mathematically, for the additive model:

$$ y(x(t)) = a_1 x_{\text{trend}}(t) + a_2 x_{\text{seasonal}}(t) + \text{noise}(t) $$

and for the multiplicative model:

$$ y(x(t)) = a_1 x_{\text{trend}}(t) \cdot a_2 x_{\text{seasonal}}(t) \cdot \text{noise}(t) $$
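The additive model can be illustrated with a hand-rolled decomposition on synthetic data. This is only a sketch under stated assumptions (synthetic series, a roughly centered 12-month moving average as the trend estimate); statsmodels offers `seasonal_decompose` for the same purpose.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
index = pd.date_range("1980-01-01", periods=72, freq="MS")
t = np.arange(72)

# Synthetic additive series: trend + seasonal + noise.
y = pd.Series(
    100 + 0.5 * t + 10 * np.sin(2 * np.pi * index.month / 12) + rng.normal(0, 1, 72),
    index=index,
)

# Trend estimate: roughly centered 12-month moving average (NaN at the edges).
trend = y.rolling(window=12, center=True).mean()

# Seasonal estimate: average detrended value per calendar month, centered at zero.
detrended = y - trend
seasonal = detrended.groupby(detrended.index.month).transform("mean")
seasonal = seasonal - seasonal.mean()

# Whatever is left after removing trend and seasonality.
residual = y - trend - seasonal
```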

Residuals plot

Monthly encoding
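One plausible reading of a monthly encoding is a one-hot encoding of the calendar month. The sketch below assumes that interpretation; the index and the `month` prefix are illustrative, not taken from the notebook.

```python
import pandas as pd

index = pd.date_range("1990-01-01", periods=24, freq="MS")

# One indicator column per calendar month (month_1 ... month_12).
months = pd.Series(index.month, index=index)
month_dummies = pd.get_dummies(months, prefix="month", dtype=int)
```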

Data filtering

From our data analysis we saw that production trends upward until 1975, after which it starts to oscillate. Our goal is to train a model that is correct on recent data, so we are going to cut off all data before 1975.
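A sketch of the cutoff on a hypothetical frame with a monthly `DatetimeIndex` (the frame and its column name are assumptions):

```python
import pandas as pd

# Hypothetical frame spanning 1970-01 to 1979-12.
df = pd.DataFrame(
    {"production": range(120)},
    index=pd.date_range("1970-01-01", periods=120, freq="MS"),
)

# Keep only observations from January 1975 onwards.
cutoff = pd.Timestamp("1975-01-01")
recent = df.loc[df.index >= cutoff]
```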

Train test split

We are going to split our dataset into two parts, a train set and a test set, containing 98% and 2% of the data respectively.
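For a time series the split has to be chronological rather than random. A minimal sketch, assuming a monthly series from 1975-01 to 1995-08 (248 points) and placeholder values:

```python
import pandas as pd

# Placeholder series covering 1975-01 .. 1995-08 (248 months).
index = pd.date_range("1975-01-01", "1995-08-01", freq="MS")
series = pd.Series(range(len(index)), index=index)

# The first 98% of observations go to train, the remainder to test.
split_at = int(len(series) * 0.98)
train, test = series.iloc[:split_at], series.iloc[split_at:]
```

With 248 monthly points this leaves only 5 months in the test set, so test metrics computed on it should be read with caution.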

Modelling

In this subsection we are going to fit and cross-validate several models, whose predictions are combined at the end of this notebook into a single averaged forecast. We are going to use a time series split with 5 folds.
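The sketch below assumes the split refers to scikit-learn's `TimeSeriesSplit`. Each fold trains on an expanding window of past observations and validates on the block that follows it. The 24-point array is a placeholder.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Placeholder data: 24 consecutive time steps.
X = np.arange(24).reshape(-1, 1)

splitter = TimeSeriesSplit(n_splits=5)
folds = list(splitter.split(X))

for fold, (train_idx, test_idx) in enumerate(folds):
    # Training indices always precede validation indices.
    print(f"fold {fold}: train={len(train_idx)} points, test={len(test_idx)} points")
```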

SARIMA (Seasonal Autoregressive Integrated Moving Average)

Since our data contains not only a trend but also a seasonal component, we are going to use SARIMA rather than a plain ARIMA model.

Parameters to tune in SARIMA:

A SARIMA model is parametrized by 7 parameters: the first 3 are the same as in ARIMA, while the last 4 account for the seasonality in our model.

Seasonal parameters:

Source: https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html#statsmodels.tsa.statespace.sarimax.SARIMAX

In the next subsection we are going to find the optimal hyperparameters for our model using the hyperopt package.

Our model did not achieve optimal performance, which is probably due to the low number of hyperparameter search rounds (time constraints).

Forecasting with SARIMA on 1996

TensorFlow Probability

In the next subsection we are going to evaluate our probabilistic model on the test set.

source: https://www.tensorflow.org/probability/examples/Structural_Time_Series_Modeling_Case_Studies_Atmospheric_CO2_and_Electricity_Demand

Compare on the testing dataset

Forecasting with Probability model

Due to the probabilistic nature of the model, the upper and lower bounds of the prediction interval widen as the forecast horizon grows, while the point prediction remains reasonable.
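The widening can be illustrated with a much simpler stand-in than the fitted model: a simulated random walk, where uncertainty accumulates step by step. This is an assumption-laden sketch of the general effect, not a reproduction of the structural time series model used here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, horizon = 5000, 16

# Each path accumulates independent unit-variance shocks over the horizon.
paths = rng.normal(0, 1, size=(n_paths, horizon)).cumsum(axis=1)

# Empirical 95% interval at every forecast step.
lower = np.percentile(paths, 2.5, axis=0)
upper = np.percentile(paths, 97.5, axis=0)
width = upper - lower

# The central prediction stays near zero while the band keeps growing.
center = paths.mean(axis=0)
```

For a random walk the interval width grows with the square root of the horizon, so after 16 steps it is about four times as wide as after one step.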

Modelling with Prophet

We are going to use Prophet, an out-of-the-box model created by Facebook.

Source: https://facebook.github.io/prophet/

Cross validation with Prophet

Forecasting with a Prophet model

Bootstrapping the predictions

We are going to train the models on the whole dataset (except for the data points before 1975) and forecast the production for the year 1996, while averaging predictions from our 3 models.
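A sketch of the averaging step, using made-up forecast values for three hypothetical models. The dictionary keys and numbers are placeholders, not results from this notebook.

```python
import numpy as np

# Hypothetical forecasts for the same three months from each model.
predictions = {
    "sarima": np.array([150.0, 160.0, 170.0]),
    "sts": np.array([148.0, 158.0, 175.0]),
    "prophet": np.array([152.0, 162.0, 165.0]),
}

# Stack into shape (n_models, n_months) and average across models.
stacked = np.vstack(list(predictions.values()))
combined = stacked.mean(axis=0)
```

An unweighted mean treats all models as equally reliable; weighting by cross-validation error would be one possible variation.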

Since our data ends at 1995-08, we are going to predict 16 months ahead to cover the whole year 1996.
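The horizon arithmetic can be checked directly: 4 remaining months of 1995 plus 12 months of 1996 give 16 steps. A small sketch, assuming month-start timestamps:

```python
import pandas as pd

last_observed = pd.Timestamp("1995-08-01")
horizon = 16

# Forecast dates start one month after the last observation.
future_index = pd.date_range(
    start=last_observed + pd.DateOffset(months=1), periods=horizon, freq="MS"
)

# Only the 1996 part of the horizon is the actual target.
forecast_1996 = future_index[future_index.year == 1996]
```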

Results

The predictions of our models agree with the patterns we found in our EDA (higher production at the end of autumn and in winter). This could mean that we have successfully forecast beer production for the year 1996.

Future ideas: